Unsupervised Recognition of Literal and Non-Literal Use of Idiomatic Expressions
نویسندگان
چکیده
We propose an unsupervised method for distinguishing literal and non-literal usages of idiomatic expressions. Our method determines how well a literal interpretation is linked to the overall cohesive structure of the discourse. If strong links can be found, the expression is classified as literal, otherwise as idiomatic. We show that this method can help to tell apart literal and non-literal usages, even for idioms which occur in canonical form.
منابع مشابه
A Cohesion-based Approach for Unsupervised Recognition of Literal and Nonliteral Use of Multiword Expression
Texts frequently contain expression whose meaning is not strictly literal, such as idioms. Idiomatic and non-literal expressions pose a major challenge to natural language processing technology as they often exhibit lexical and syntactic idiosyncrasies. We propose a novel unsupervised method for distinguishing literal and non-literal usages of expressions. Our method determines how well a liter...
متن کاملLiteral or idiomatic? Identifying the reading of single occurrences of German multiword expressions using word embeddings
Non-compositional multiword expressions (MWEs) still pose serious issues for a variety of natural language processing tasks and their ubiquity makes it impossible to get around methods which automatically identify these kind of MWEs. The method presented in this paper was inspired by Sporleder and Li (2009) and is able to discriminate between the literal and non-literal use of an MWE in an unsu...
متن کاملUnsupervised Type and Token Identification of Idiomatic Expressions
Idiomatic expressions are plentiful in everyday language, yet they remain mysterious, as it is not clear exactly how people learn and understand them. They are of special interest to linguists, psycholinguists, and lexicographers, mainly because of their syntactic and semantic idiosyncrasies as well as their unclear lexical status. Despite a great deal of research on the properties of idioms in...
متن کاملClassifier Combination for Contextual Idiom Detection Without Labelled Data
We propose a novel unsupervised approach for distinguishing literal and non-literal use of idiomatic expressions. Our model combines an unsupervised and a supervised classifier. The former bases its decision on the cohesive structure of the context and labels training data for the latter, which can then take a larger feature space into account. We show that a combination of both classifiers lea...
متن کاملDetecting and Processing Figurative Language in Discourse
Figurative language poses a serious challenge to NLP systems. The use of idiomatic and metaphoric expressions is not only extremely widespread in natural language; many figurative expressions, in particular idioms, also behave idiosyncratically. These idiosyncrasies are not restricted to a non-compositional meaning but often also extend to syntactic properties, selectional preferences etc. To d...
متن کامل